NVIDIA GPUs


KAITIAN: A Unified Communication Framework for Enabling Efficient Collaboration Across Heterogeneous Accelerators in Embodied AI Systems

Jieke Lin, Wanyu Wang, Longxiang Yin, Yinhe Han

arXiv.org Artificial Intelligence

Embodied Artificial Intelligence (AI) systems, such as autonomous robots and intelligent vehicles, are increasingly reliant on diverse heterogeneous accelerators (e.g., GPGPUs, NPUs, FPGAs) to meet stringent real-time processing and energy-efficiency demands. However, the proliferation of vendor-specific proprietary communication libraries creates significant interoperability barriers, hindering seamless collaboration between different accelerator types and leading to suboptimal resource utilization and performance bottlenecks in distributed AI workloads. This paper introduces KAITIAN, a novel distributed communication framework designed to bridge this gap. KAITIAN provides a unified abstraction layer that intelligently integrates vendor-optimized communication libraries for intra-group efficiency with general-purpose communication protocols for inter-group interoperability. Crucially, it incorporates a load-adaptive scheduling mechanism that dynamically balances computational tasks across heterogeneous devices based on their real-time performance characteristics. Implemented as an extension to PyTorch and rigorously evaluated on a testbed featuring NVIDIA GPUs and Cambricon MLUs, KAITIAN demonstrates significant improvements in resource utilization and scalability for distributed training tasks. Experimental results show that KAITIAN can reduce training time by up to 42% compared to baseline homogeneous systems, while incurring minimal communication overhead (2.8--4.3%) and maintaining model accuracy. KAITIAN paves the way for more flexible and powerful heterogeneous computing in complex embodied AI applications.
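
The paper's code is not shown here, but the load-adaptive scheduling idea can be sketched: measure each accelerator's recent throughput and split the global batch in proportion to it. The Python snippet below is a minimal illustration under that assumption; the function name and device labels are hypothetical, not KAITIAN's actual API.

```python
# Hypothetical sketch of load-adaptive batch sharding across heterogeneous
# accelerators; names are illustrative, not KAITIAN's real interface.
def shard_batch(batch_size, throughputs):
    """Split a global batch in proportion to measured device throughput."""
    total = sum(throughputs.values())
    shards = {dev: max(1, round(batch_size * rate / total))
              for dev, rate in throughputs.items()}
    # Correct rounding drift so the shards sum exactly to batch_size.
    drift = batch_size - sum(shards.values())
    shards[max(throughputs, key=throughputs.get)] += drift
    return shards

# Example: a GPU measured at ~3x the MLU's samples/sec gets ~3x the work.
print(shard_batch(256, {"cuda:0": 900.0, "mlu:0": 300.0}))
# -> {'cuda:0': 192, 'mlu:0': 64}
```

Re-running the split periodically from fresh throughput measurements is what would make such a scheme adaptive rather than static.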


US seeks to thwart smuggling of Nvidia GPUs with location tracking

PCWorld

The United States has reportedly been investigating reports that Nvidia GPUs have landed illegally in China to be used by Chinese LLMs like DeepSeek, and one US lawmaker plans to introduce a new bill that aims to track the locations of AI chips--like the ones made by Nvidia--after they're sold, report Reuters and Neowin. The smuggling of CPUs and GPUs is nothing new: PC components have been smuggled across the ocean to China and other East Asian countries for years. But with the rising power of AI and its implications for technological prowess, it's no surprise that the US government doesn't want that tech falling into rival hands. The proposed legislation would oblige US authorities to develop regulations for location verification of AI chips.


The evolution of AI: From AlphaGo to AI agents, physical AI, and beyond

MIT Technology Review

The release of ChatGPT by OpenAI in November 2022 marked another significant milestone in the evolution of AI. ChatGPT, a large language model capable of generating human-like text, demonstrated the potential of AI to understand and generate natural language. This capability opened up new possibilities for AI applications, from customer service to content creation. The world responded to ChatGPT with a mix of awe and excitement, recognizing the potential of AI to transform how humans communicate and interact with technology to enhance our lives. Today, the rise of agentic AI -- systems capable of advanced reasoning and task execution -- is revolutionizing the way organizations operate.


Data gold rush: companies once focused on mining cryptocurrency pivot to generative AI

The Guardian

Since generative AI exploded into global consciousness in 2023, an unprecedented demand for computing power has emerged alongside the demand for apps utilising the technology. Tools like OpenAI's ChatGPT require thousands of Nvidia GPUs (graphics processing units) to smoothly process all the information being fed in and generated as output. Nvidia last week compared GPUs to rare earth metals for AI, saying they're "foundational" for the operation of generative AI today. The energy required to power all this hardware is equivalent to that of a small country, according to a report released by French energy company Schneider Electric last year. On Wednesday OpenAI's CEO, Sam Altman, told an audience at Davos that an energy breakthrough was needed to power AI advances.


It Was Founded in a Denny's. Now It's Worth More Than Facebook.

Slate

Nvidia, the company that dominates the market for graphics processing units, was once known mostly in the video game world. But these days, Nvidia GPUs are also the go-to source for the massive computing power needed to run generative A.I. systems--and the recent explosion in A.I. hype has propelled the company's stock into the stratosphere. Nvidia briefly hit a trillion-dollar valuation, putting itself in league with tech giants like Alphabet and Apple and launching a bit of a frenzy in the markets. Nvidia is looking like the first big stock win of the A.I. era, and investors are salivating. On Sunday's episode of What Next: TBD, I spoke with Don Clark, a freelance reporter who specializes in the chips industry, about how Nvidia rode the A.I. revolution, became the hottest chipmaker in the world, and made the entire A.I. craze suddenly seem very real.


Microsoft to showcase purpose-built AI infrastructure at NVIDIA GTC

#artificialintelligence

Join Microsoft at NVIDIA GTC, a free online global technology conference running March 20 to 23, to learn how organizations of any size can power AI innovation with purpose-built cloud infrastructure from Microsoft. Microsoft's Azure AI supercomputing infrastructure is uniquely designed for AI workloads and helps build and train some of the industry's most advanced AI solutions. From data preparation to model and infrastructure performance management, Azure's comprehensive portfolio of powerful and massively scalable GPU-accelerated virtual machines (VMs), together with seamless integration with services like Azure Batch and open-source solutions, helps streamline management and automation of large AI models and infrastructure. Attend NVIDIA GTC to discover how Azure AI infrastructure optimized for AI performance can deliver speed and scale in the cloud and help you reduce the complexity of building, training, and bringing AI models into production. Don't miss session S52469 featuring Nidhi Chappell, named one of HPCwire's 2023 People to Watch and recognized as a high-performance computing (HPC) luminary.


The Best GPUs for Deep Learning in 2023 -- An In-depth Analysis

#artificialintelligence

Deep learning is a field with intense computational requirements, and your choice of GPU will fundamentally determine your deep learning experience. But what features are important if you want to buy a new GPU? How do you make a cost-efficient choice? This blog post will delve into these questions, tackle common misconceptions, give you an intuitive understanding of how to think about GPUs, and offer advice to help you make a choice that is right for you. It is designed to give you different levels of understanding of GPUs and of the new RTX 40 Ada series GPUs from NVIDIA. You have a choice: (1) if you are not interested in the details of how GPUs work, what makes a GPU fast compared to a CPU, and what is unique about the new NVIDIA RTX 40 Ada series, you can skip right to the performance and performance-per-dollar charts and the recommendation section. The cost/performance numbers form the core of the blog post, and the content surrounding them explains the details of what makes up GPU performance. You might want to skip a section or two based on your understanding of the presented topics. The post is structured in the following way. First, I will explain what makes a GPU fast. I will discuss CPUs vs GPUs, Tensor Cores, memory bandwidth, and the memory hierarchy of GPUs, and how these relate to deep learning performance. These explanations might help you get a more intuitive sense of what to look for in a GPU. I then discuss the unique features of the new NVIDIA RTX 40 Ada GPU series that are worth considering if you buy a GPU. From there, I make GPU recommendations for different scenarios. After that follows a Q&A section of common questions posed to me in Twitter threads; in that section, I will also address common misconceptions and some miscellaneous issues, such as cloud vs desktop, cooling, AMD vs NVIDIA, and others. If you use GPUs frequently, it is useful to understand how they work. This knowledge will help you understand the cases where GPUs are fast or slow. In turn, you might better understand why you need a GPU in the first place and how other future hardware options might be able to compete. You can skip this section if you just want the useful performance numbers and arguments to help you decide which GPU to buy.
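
One part of the memory-bandwidth argument can be made concrete with a back-of-envelope roofline estimate: a matrix multiply is bandwidth-bound when its arithmetic intensity (FLOPs per byte moved) falls below the ratio of peak compute to memory bandwidth. The sketch below uses round, illustrative spec numbers, not measurements of any particular card.

```python
# Back-of-envelope roofline check: is a matmul compute- or bandwidth-bound?
def matmul_intensity(m, n, k, bytes_per_elem=2):  # 2 bytes/elem for fp16
    flops = 2 * m * n * k                          # multiply-accumulates
    traffic = bytes_per_elem * (m * k + k * n + m * n)  # read A, B; write C
    return flops / traffic                         # FLOPs per byte moved

peak_tflops = 165.0     # illustrative fp16 Tensor Core peak, TFLOP/s
bandwidth_tbs = 1.0     # illustrative memory bandwidth, TB/s
ridge = peak_tflops / bandwidth_tbs  # intensity needed to saturate compute

for size in (128, 1024, 8192):
    ai = matmul_intensity(size, size, size)
    bound = "compute" if ai > ridge else "bandwidth"
    print(f"{size}^3 matmul: {ai:.0f} FLOPs/byte -> {bound}-bound")
```

Small matrix multiplies fall below the ridge point and are limited by memory bandwidth, which is one reason bandwidth figures matter as much as raw TFLOPS for many deep learning workloads.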


Hands-on Machine Learning with AWS and NVIDIA

#artificialintelligence

Machine learning (ML) projects can be complex, tedious, and time-consuming. AWS and NVIDIA solve this challenge with fast, effective, and easy-to-use capabilities for your ML project. This course is designed for ML practitioners, including data scientists and developers, who have a working knowledge of machine learning workflows. In this course, you will gain hands-on experience building, training, and deploying scalable machine learning models with Amazon SageMaker and Amazon EC2 instances powered by NVIDIA GPUs. Amazon SageMaker helps data scientists and developers prepare, build, train, and deploy high-quality ML models quickly by bringing together a broad set of capabilities purpose-built for ML.
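
As a flavor of the pattern the course teaches, the sketch below launches a training script on a GPU-backed SageMaker instance via the SageMaker Python SDK (v2-style arguments assumed); the script name, S3 path, and instance choice are placeholders you would supply.

```python
# Minimal sketch: train a PyTorch model on an NVIDIA GPU instance with
# Amazon SageMaker. Placeholders: train.py and the S3 dataset path.
import sagemaker
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",               # your training script
    role=sagemaker.get_execution_role(),  # IAM role (works inside SageMaker)
    instance_count=1,
    instance_type="ml.p3.2xlarge",        # backed by one NVIDIA V100 GPU
    framework_version="1.13",
    py_version="py39",
)
estimator.fit({"training": "s3://your-bucket/your-dataset/"})
```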


Microsoft Shares What's Next In Machine Learning At NVIDIA GTC

#artificialintelligence

Finding scalable solutions for today's global challenges requires forward-thinking, transformative tools. As environmental, economic, and public health concerns mount, Microsoft Azure is addressing these challenges head on with high-performance computing (HPC), AI, and machine learning. As the behind-the-scenes power for everything from MRI scans to energy management and financial services, these technologies are equipping customers and developers with innovative solutions that break through the boundaries of what's possible in data and compute, paving the way for growth opportunities that span industries and applications around the world. Microsoft Azure is committed to unlocking these new opportunities for our customers, providing the broadest range of NVIDIA GPUs at the edge, on-premises, in the cloud, and for hybrid environments. At NVIDIA GTC we will demonstrate this commitment by showing how Azure's advanced HPC capabilities and AI/machine learning in the cloud are driving transformation and making an impact together with NVIDIA's latest technology.


Optimizing TF, XLA and JAX for LLM Training on NVIDIA GPUs

#artificialintelligence

Posted by Douglas Yarrington (Google TPgM), James Rubin (Google PM), Neal Vaidya (NVIDIA TME), and Jay Rodge (NVIDIA PMM)

Together, NVIDIA and Google are delighted to announce new milestones and plans to optimize TensorFlow and JAX for the Ampere and recently announced Hopper GPU architectures by leveraging the power of XLA: a performant, flexible, and extensible ML compiler built by Google.
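
For a sense of what XLA buys JAX users on NVIDIA GPUs, here is a minimal, self-contained example: wrapping a function in jax.jit hands it to XLA, which fuses the elementwise arithmetic into fewer GPU kernels on the architectures mentioned above. The GELU approximation is a standard illustration, not code from the announcement.

```python
# Minimal JAX/XLA example: jit compiles the function once, then reuses
# the fused kernel on subsequent calls (on GPU when one is available).
import jax
import jax.numpy as jnp

@jax.jit
def gelu(x):
    # tanh approximation of GELU; XLA fuses these elementwise ops
    return 0.5 * x * (1.0 + jnp.tanh(0.79788456 * (x + 0.044715 * x**3)))

x = jnp.linspace(-3.0, 3.0, 8)
print(gelu(x))  # first call triggers XLA compilation; later calls are fast
```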